104 research outputs found

    TRANSPATH®—A High Quality Database Focused on Signal Transduction

    TRANSPATH® can be used either as an encyclopedia, for both specific and general information on signal transduction, or as a network analyser. Therefore, three modules have been created: the first one is the data, which have been manually extracted, mostly from the primary literature; the second is PathwayBuilder™, which provides several different types of network visualization and hence facilitates understanding; the third is ArrayAnalyzer™, which is particularly suited to gene expression array interpretation, and is able to identify key molecules within signalling networks (potential drug targets). These key molecules could be responsible for the coordinated regulation of downstream events. Manual data extraction focuses on direct reactions between signalling molecules and the experimental evidence for them, including species of genes/proteins used in individual experiments, experimental systems, materials and methods. This combination of materials and methods is used in TRANSPATH® to assign a quality value to each experimentally proven reaction, which reflects the probability that this reaction would happen under physiological conditions. Another important feature in TRANSPATH® is the inclusion of transcription factor–gene relations, which are transferred from TRANSFAC®, a database focused on transcription regulation and transcription factors. Since interactions between molecules are mainly direct, this allows a complete and stepwise pathway reconstruction from ligands to regulated genes. More information is available at www.biobase.de/pages/products/databases.html

    Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5% of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs, are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that selection which causes their conservation is not always very strong.
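
The comparison of motif abundances against a background set can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the pseudocount, and the toy sequences are assumptions made purely for illustration. It counts every k-mer in a foreground and a background set and reports the log2 ratio of their relative frequencies.

```python
from collections import Counter
from math import log2


def kmer_counts(seqs, k):
    """Count all overlapping k-mers made only of A, C, G, T."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def log2_enrichment(foreground, background, k, pseudo=1.0):
    """log2 ratio of k-mer frequencies (foreground vs. background).

    The pseudocount (an assumption of this sketch) avoids division by
    zero for k-mers absent from one of the two sets.
    """
    fg, bg = kmer_counts(foreground, k), kmer_counts(background, k)
    n_kmers = 4 ** k
    fg_total = sum(fg.values()) + pseudo * n_kmers
    bg_total = sum(bg.values()) + pseudo * n_kmers
    return {
        m: log2(((fg[m] + pseudo) / fg_total) / ((bg[m] + pseudo) / bg_total))
        for m in set(fg) | set(bg)
    }


# Toy input: an A-rich "foreground" against a mixed "background".
scores = log2_enrichment(["AAAAAA"], ["ACGTAC"], k=2)
```

A positive score marks a k-mer that is relatively more frequent in the foreground. As the abstract stresses, such scores largely reflect overall base composition, which is why a composition-matched random background is one of the three controls.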

    Genome wide prediction of HNF4α functional binding sites by the use of local and global sequence context

    An application of machine learning algorithms enables prediction of the functional context of transcription factor binding sites in the human genome

    TRANSPATH(®): an information resource for storing and visualizing signaling pathways and their pathological aberrations

    TRANSPATH(®) is a database about signal transduction events. It provides information about signaling molecules, their reactions and the pathways these reactions constitute. The representation of signaling molecules is organized in a number of orthogonal hierarchies reflecting the classification of the molecules, their species-specific or generic features, and their post-translational modifications. Reactions are similarly hierarchically organized in a three-layer architecture, differentiating between reactions that are evidenced by individual publications, generalizations of these reactions to construct species-independent ‘reference pathways’ and the ‘semantic projections’ of these pathways. A number of search and browse options allow easy access to the database contents, which can be visualized with the tool PathwayBuilder™. The module PathoSign adds data about pathologically relevant mutations in signaling components, including their genotypes and phenotypes. TRANSPATH(®) and PathoSign can be used as an encyclopaedia, in the educational process, for visualization and modeling of signal transduction networks and for the analysis of gene expression data. TRANSPATH(®) Public 6.0 is freely accessible for users from non-profit organizations under

    FeatureScan: revealing property-dependent similarity of nucleotide sequences

    FeatureScan is a software package aiming to reveal novel types of DNA sequence similarity by comparing physico-chemical properties. Thirty-eight different parameters of DNA double strands such as charge, melting enthalpy, conformational parameters and the like are provided. As input, FeatureScan requires two sequences, a pattern sequence and a target sequence; search conditions are set by selecting a specific DNA parameter and a threshold value. Search results are displayed in FASTA format and directly linked to external genome databases/browsers (ENSEMBL, NCBI, UCSC). An Internet version of FeatureScan is accessible at . As part of the HOBIT initiative () FeatureScan is also accessible as a web service at its above home page. Currently, several preloaded genomes are provided at this Internet website (Homo sapiens, Mus musculus, Rattus norvegicus and four strains of Escherichia coli) as target sequences. Standalone executables of FeatureScan are available on request
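
The idea of comparing a pattern and a target by a property profile rather than by letters can be sketched as follows. This is not FeatureScan's algorithm or parameter set: the toy property (number of G/C bases per dinucleotide step), the distance measure, and all function names are assumptions chosen for illustration.

```python
def gc_steps(dinuc):
    """Toy per-step property: number of G/C bases in a dinucleotide."""
    return sum(base in "GC" for base in dinuc)


def property_profile(seq, prop):
    """Property value for each overlapping dinucleotide step."""
    seq = seq.upper()
    return [prop(seq[i:i + 2]) for i in range(len(seq) - 1)]


def scan(pattern, target, prop=gc_steps, threshold=0.0):
    """Return (start, distance) for target windows whose property
    profile is within `threshold` of the pattern's profile.

    Distance is the mean absolute difference between the two profiles.
    """
    p = property_profile(pattern, prop)
    if not p:
        raise ValueError("pattern must contain at least two bases")
    t = property_profile(target, prop)
    hits = []
    for start in range(len(t) - len(p) + 1):
        window = t[start:start + len(p)]
        dist = sum(abs(a - b) for a, b in zip(p, window)) / len(p)
        if dist <= threshold:
            hits.append((start, dist))
    return hits


hits = scan("ATGC", "TTATCGAA")
```

With these toy inputs the only hit is the window starting at position 2, "ATCG", which differs from the pattern "ATGC" in sequence but has the same property profile. That is the kind of letter-independent similarity the abstract describes.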

    Walking pathways with positive feedback loops reveal DNA methylation

    Background: the search for molecular biomarkers of early-onset colorectal cancer (CRC) is an important but still quite challenging and unsolved task. Detection of CpG methylation in human DNA obtained from blood or stool has been proposed as a promising approach to a noninvasive early diagnosis of CRC. Thousands of abnormally methylated CpG positions in CRC genomes are often located in non-coding parts of genes. Novel bioinformatic methods are thus urgently needed for multi-omics data analysis to reveal causative biomarkers with a potential driver role in early stages of cancer. Methods: we have developed a method for finding potential causal relationships between epigenetic changes (DNA methylations) in gene regulatory regions that affect transcription factor binding sites (TFBS) and gene expression changes. This method also considers the topology of the involved signal transduction pathways and searches for positive feedback loops that may cause the carcinogenic aberrations in gene expression. We call this method 'Walking pathways', since it searches for potential rewiring mechanisms in cancer pathways due to dynamic changes in the DNA methylation status of important gene regulatory regions ('epigenomic walking'). Results: in this paper, we analysed an extensive collection of full genome gene-expression data (RNA-seq) and DNA methylation data of genomic CpG islands (using Illumina methylation arrays) generated from a sample of tumor and normal gut epithelial tissues of 300 patients with colorectal cancer (at different stages of the disease) (data generated in the EU-supported SysCol project). Identification of potential epigenetic biomarkers of DNA methylation was performed using the fully automatic multi-omics analysis web service 'My Genome Enhancer' (MGE) (my-genome-enhancer.com). MGE uses the database on gene regulation TRANSFAC®, the signal transduction pathways database TRANSPATH®, and software that employs AI (artificial intelligence) methods for the analysis of cancer-specific enhancers. Conclusions: the identified biomarkers underwent experimental testing on an independent set of blood samples from patients with colorectal cancer. As a result, using advanced methods of statistics and machine learning, a minimum set of 6 biomarkers was selected, which together achieve the best cancer detection potential. The markers include hypermethylated positions in regulatory regions of the following genes: CALCA, ENO1, MYC, PDX1, TCF7, ZNF43
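
The search for positive feedback loops in a pathway topology can be illustrated with a generic graph routine. This is not the 'Walking pathways' or MGE implementation: it assumes a hypothetical signed directed graph (edges labelled +1 for activation, -1 for inhibition) and reports simple cycles whose sign product is positive.

```python
def positive_feedback_loops(edges):
    """Find simple cycles with a positive overall sign.

    `edges` maps (source, target) -> +1 (activation) or -1 (inhibition).
    Each cycle is reported once, starting from its smallest node name.
    """
    graph = {}
    for (src, dst), sign in edges.items():
        graph.setdefault(src, []).append((dst, sign))
    loops = []

    def walk(start, node, path, sign):
        for nxt, s in graph.get(node, []):
            if nxt == start:
                if sign * s > 0:  # even number of inhibitions
                    loops.append(path[:])
            elif nxt > start and nxt not in path:
                path.append(nxt)
                walk(start, nxt, path, sign * s)
                path.pop()

    for start in sorted(graph):
        walk(start, start, [start], 1)
    return loops


# Hypothetical toy network, not taken from TRANSPATH.
toy_edges = {
    ("A", "B"): +1, ("B", "A"): +1,   # mutual activation: positive loop
    ("B", "C"): +1, ("C", "B"): -1,   # negative feedback
    ("C", "D"): -1, ("D", "C"): -1,   # mutual inhibition: positive loop
}
loops = positive_feedback_loops(toy_edges)
```

A loop counts as positive when it contains an even number of inhibitory edges, so mutual inhibition qualifies as well as mutual activation. Exhaustive cycle enumeration is only practical on small networks; the sketch makes no claim about how the published method scales.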

    Advanced Computational Biology Methods Identify Molecular Switches for Malignancy in an EGF Mouse Model of Liver Cancer

    The molecular causes by which the epidermal growth factor receptor tyrosine kinase induces malignant transformation are largely unknown. To better understand EGF's transforming capacity, whole genome scans were applied to a transgenic mouse model of liver cancer and subjected to advanced methods of computational analysis to construct de novo gene regulatory networks based on a combination of sequence analysis and entrained graph-topological algorithms. Here we identified transcription factors, processes, key nodes and molecules to connect as yet unknown interacting partners at the level of protein-DNA interaction. Many of those could be confirmed by electromobility band shift assay at recognition sites of gene specific promoters and by western blotting of nuclear proteins. A novel cellular regulatory circuitry could therefore be proposed that connects cell cycle regulated genes with components of the EGF signaling pathway. Promoter analysis of differentially expressed genes suggested the majority of regulated transcription factors to display specificity to either the pre-tumor or the tumor state. Subsequent search for signal transduction key nodes upstream of the identified transcription factors and their targets suggested the insulin-like growth factor pathway to render the tumor cells independent of EGF receptor activity. Notably, expression of IGF2 in addition to many components of this pathway was highly upregulated in tumors. Together, we propose a switch in autocrine signaling to foster tumor growth that was initially triggered by EGF and demonstrate the knowledge gain from promoter analysis combined with upstream key node identification

    Functional classification of proteins based on projection of amino acid sequences: application for prediction of protein kinase substrates

    Background: The knowledge about proteins with specific interaction capacity to the protein partners is very important for the modeling of cell signaling networks. However, the experimentally derived data are not sufficiently complete for the reconstruction of signaling pathways. This problem can be solved by enriching the network with predicted protein interactions. The previously published in silico method PAAS was applied for prediction of interactions between protein kinases and their substrates. Results: We used the method for recognition of the protein classes defined by the interaction with the same protein partners. 1021 protein kinase substrates classified by 45 kinases were extracted from the Phospho.ELM database and used as a training set. Reasonable prediction accuracy, calculated by a leave-one-out cross-validation procedure, was observed in the majority of kinase-specificity classes. Random multiple splitting of the studied set into test and training sets also led to satisfactory results. The kinase substrate specificity for 186 proteins extracted from the TRANSPATH® database was predicted by the PAAS method. Several kinase-substrate interactions described in this database were correctly predicted. Using the previously developed ExPlain™ system for the reconstruction of signal transduction pathways, we showed that addition of the newly predicted interactions enabled us to find the possible path between signal trigger, TNF-alpha, and its target genes in the cell. Conclusions: It was shown that the predictions of protein kinase substrates by PAAS were suitable for the enrichment of signaling pathway networks and identification of the novel signaling pathways. The on-line version of PAAS for prediction of protein kinase substrates is freely available at http://www.ibmc.msk.ru/PAAS/.
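
The leave-one-out cross-validation mentioned in the abstract can be sketched generically. PAAS itself is not reproduced here: the nearest-neighbour classifier is a stand-in, and the one-dimensional toy features and the class labels "k1"/"k2" are invented for illustration.

```python
def loo_accuracy(samples, labels, classify):
    """Leave-one-out cross-validation: hold out each sample in turn,
    train on the rest, and return the fraction predicted correctly."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier: label of the closest training sample."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]


# Invented toy data: two well-separated classes.
toy_samples = [(0.0,), (0.1,), (1.0,), (1.1,)]
toy_labels = ["k1", "k1", "k2", "k2"]
accuracy = loo_accuracy(toy_samples, toy_labels, nearest_neighbour)
```

Because every sample is held out exactly once, the estimate uses all available data, which suits small per-class training sets such as substrate lists for individual kinases.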

    Role of Phagocytosis in the Pro-Inflammatory Response in LDL-Induced Foam Cell Formation; a Transcriptome Analysis

    Excessive accumulation of lipid inclusions in the arterial wall cells (foam cell formation) caused by modified low-density lipoprotein (LDL) is the earliest and most noticeable manifestation of atherosclerosis. The mechanisms of foam cell formation are not fully understood and can involve altered lipid uptake, impaired lipid metabolism, or both. Recently, we have identified the top 10 master regulators that were involved in the accumulation of cholesterol in cultured macrophages induced by the incubation with modified LDL. It was found that most of the identified master regulators were related to the regulation of the inflammatory immune response, but not to lipid metabolism. A possible explanation for this unexpected result is a stimulation of the phagocytic activity of macrophages by modified LDL particle associates that have a relatively large size. In the current study, we investigated gene regulation in macrophages using transcriptome analysis to test the hypothesis that the primary event occurring upon the interaction of modified LDL and macrophages is the stimulation of phagocytosis, which subsequently triggers the pro-inflammatory immune response. We identified genes that were up- or downregulated following the exposure of cultured cells to modified LDL or latex beads (inert phagocytosis stimulators). Most of the identified master regulators were involved in the innate immune response, and some of them were encoding major pro-inflammatory proteins. The obtained results indicated that pro-inflammatory response to phagocytosis stimulation precedes the accumulation of intracellular lipids and possibly contributes to the formation of foam cells. In this way, the currently recognized hypothesis that the accumulation of lipids triggers the pro-inflammatory response was not confirmed. Comparative analysis of master regulators revealed similarities in the genetic regulation of the interaction of macrophages with naturally occurring LDL and desialylated LDL. Oxidized and desialylated LDL affected a different spectrum of genes than naturally occurring LDL. These observations suggest that desialylation is the most important modification of LDL occurring in vivo. Thus, modified LDL caused the gene regulation characteristic of the stimulation of phagocytosis. Additionally, the effect of knocking down five master regulators (IL15, EIF2AK3, F2RL1, TSPYL2, and ANXA1) on intracellular lipid accumulation was tested. We knocked down these genes in primary macrophages derived from human monocytes. The addition of atherogenic naturally occurring LDL caused a significant accumulation of cholesterol in the control cells. The knock-down of the EIF2AK3 and IL15 genes completely prevented cholesterol accumulation in cultured macrophages. The knock-down of the ANXA1 gene caused a further decrease in cholesterol content in cultured macrophages. At the same time, knock-down of F2RL1 and TSPYL2 did not cause an effect. The results obtained allowed us to explain how the inflammatory response and the accumulation of cholesterol are related, confirming our hypothesis of atherogenesis development, which rests on the following points: LDL particles undergo atherogenic modifications that are, in turn, accompanied by the formation of self-associates; large LDL associates stimulate phagocytosis; as a result of phagocytosis stimulation, pro-inflammatory molecules are secreted; these molecules cause or at least contribute to the accumulation of intracellular cholesterol. Therefore, it became obvious that the primary event in this sequence is not the accumulation of cholesterol but an inflammatory response

    CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation

    Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory ‘grammar’, or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method, CORECLUST (COnservative REgulatory CLUster STructure), that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may subsequently be used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila